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^H Abstract 

The pivot algorithm for self-avoiding walks has been implemented in a manner which is dramatically faster 

^«— ^ than previous implementations, enabling extremely long walks to be efficiently simulated. We explicitly describe 

the data structures and algorithms used, and provide a heuristic argument that the mean time per attempted pivot 

^ for N-step self-avoiding walks is 0(1) for the square and simple cubic lattices. Numerical experiments conducted 

(~] for self-avoiding walks with up to 268 million steps are consistent with o(logA') behavior for the square lattice 

O and O(logA') behavior for the simple cubic lattice. Our method can be adapted to other models of polymers with 

short-range interactions, on the lattice or in the continuum, and hence promises to be widely useful. 
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1 Introduction and results 



H The self-avoiding walk (SAW) model is an important model in statistical physics [16]. It models the excluded- 

I volume effect observed in real polymers, and exactly captures universal features such as critical exponents and 

^ amplitude ratios. It is also an important model in the study of critical phenomena, as it is the n — > limit of the 

i^ n-vector model, which includes the Ising model (n = 1) as another instance. Indeed, one can straightforwardly 

Q simulate SAWs in the infinite volume limit, which makes this model particularly favorable for the calculation of 

'— ' critical parameters. Exact results are known for self-avoiding walks in two dimensions [19, 14] and for d > 4 

I (mean-field behavior has been proved for d>5 [8]), but not for the most physically interesting case ofd — 3. 

j>. The pivot algorithm is a powerful and oft-used approach to the study of self-avoiding walks, invented by Lai [13] 

■^ and later elucidated and popularized by Madras and Sokal [17], The pivot algorithm uses pivot moves as the 

■^ transitions in a Markov chain which proceeds as follows. From an initial SAW of length A^, such as a straight rod, 

new A^-step walks are successively generated by choosing a site of the walk at random, and attempting to apply a 

lattice symmetry operation, or pivot, to one of the parts of the walk; if the resulting walk is self -avoiding the move 

J;^ is accepted, otherwise the move is rejected and the original walk is retained. Thus a Markov chain is formed in 

the ensemble of SAWs of fixed length; this chain satisfies detailed balance and is ergodic, ensuring that SAWs are 

sampled uniformly at random. 

• • One typical use of the pivot algorithm is to calculate observables which characterize the size of the SAWs: the 

. !^ squared end-to-end distance R^, the squared radius of gyration rI, and the mean-square distance of a monomer from 

^^ its endpoints R^. To leading order we expect the mean values of these observables over all SAWs of A^ steps, with 

H each SAW is given equal weight, to be {R^)n ^ D^N^^ (x e {e, g, m}), with v a universal critical exponent. 

For A^-step SAWs, the implementation of the pivot algorithm due to Madras and Sokal has estimated mean time 
per attempted pivot of (9 (A^*^'^^) onZ^ and 0{N^'^'^) on I?; performance was significantly improved by Kennedy [9] 
to 0{N°-'^^) and 0{N°-'''^) respectively. 

In this article, we give a detailed description of a new data structure we call the SAW-tree. This data structure 
allows us to implement the pivot algorithm in a highly efficient manner: we present a heuristic argument that the 
mean time per attempted pivot is (9(1) on Z^ and Z^, and numerical experiments which show that for walks of up 
to N = 2^^ — 1 « 2.7 X 10^ steps the algorithmic complexity is well approximated by 0{logN). This improvement 
enables the rapid simulation of walks with many millions of steps. 



In a companion article [4], we describe the algorithm in general terms, and demonstrate the power of the method 
by applying it to the problem of calculating the critical exponent v for three-dimensional self-avoiding walks. 

Thus far the SAW-tree has been implemented for Z^, 71? , and Z"*, but it can be straightforwardly adapted to 
other lattices and the continuum, as well as polymer models with short-range interactions. Other possible extensions 
would be to allow for branched polymers, confined polymers, or simulation of polymers in solution. 

We intend to implement the SAW-tree and associated methods as an open source software library for use by 
researchers in the field of polymer simulation. 

1.1 Pivot algorithm 

Madras and Sokal [17] demonstrated, through strong heuristic arguments and numerical experiments, that the pivot 
algorithm results in a Markov chain with short integrated autocorrelation time for global observables. The pivot 
algorithm is far more efficient than Markov chains which utilize local moves; see [17, 15, 22, 23] for detailed 
discussion. 

The implementation of the pivot algorithm by Madras and Sokal utilized a hash table to record the location of 
each site of the walk. They showed that for A^-step SAWs the probability of a pivot move being accepted is 0{N^'^), 
with p dimension-dependent but close to zero {p < 0.2). As accepted pivots typically result in a large change in 
global observables such as iJg, this leads to the conclusion that the pivot algorithm has integrated autocorrelation 
time 0{N''), with possible logarithmic corrections. In addition, they argued convincingly that the CPU time per 
successful pivot is 0{N) for their implementation. Throughout this article we work with the mean time per attempted 
pivot, T{N), which for the Madras and Sokal implementation is 0{N^^p). 

Madras and Sokal argued that 0{N) per successful pivot is best possible because it takes time 0{N) to merely 
write down an A^-step SAW. Kennedy [9], however, recognized that it is not necessary to write down the SAW for 
each successful pivot, and developed a data structure and algorithm which cleverly utilized geometric constraints to 
break the 0{N) barrier In this paper, we develop methods which further improve the use of geometric constraints 
to obtain a highly efficient implementation of the pivot algorithm. 

1.2 Results 

We have efficiently implemented the pivot algorithm via a data structure we call the SAW-tree, which allows rapid 
Monte Carlo simulation of SAWs with millions of steps. This new implementation can also be adapted to other 
models of polymers with short-range interactions, on the lattice and in the continuum, and hence promises to be 
widely useful. 

The heart of our implementation of the algorithm involves performing intersection tests between "bounding 
boxes" of different sub-walks when a pivot is attempted. In [4] we generated large samples of walks with up to 
2^^ — l«3.3xlO^ steps, but for the purpose of determining the complexity of our algorithm we have also generated 
smaller samples of walks of up to 2 — 1 w 2.7 x 10 steps. For A^ = 2 — 1, the mean number of intersection tests 
needed per attempted pivot is remarkably low: 39 for I?, 158 for Z^, and 449 for Z'*. 

In Sec. 3 we present heuristic arguments for the asymptotic behavior of the mean time per attempted pivot 
for A^-step SAWs, T{N), and test these predictions with computer experiments for A^ < 2.7 x 10^. We summarize 
our results in Table 1; note that 0{f{N)) indicates T is bounded above by / asymptotically, o{f{N)) indicates 
/ dominates T, a>{f{N)) indicates T dominates /, and &{f{N)) indicates / bounds T both above and below. 
For comparison, we also give the algorithmic complexity of the implementations of Madras and Sokal [17], and 
Kennedy [9]. In Sec. 3.5, we develop an argument for the complexity of our algorithm on I?; this same argument 
leads to an estimate for the performance of the implementation of Madras and Sokal on Z*. We do not know the 
complexity of Kennedy's implementation for I? and Z'' with d > A, but we suspect it is 0{N'') with 0.74 < q < 1, 
with possible logarithmic corrections. 

Our implementation is also fast in practice: for simulations of walks of length 2^*^ w 10^ on 7?, our imple- 
mentation is almost 400 times faster when compared with Kennedy's, and close to four thousand times faster when 
compared with that of Madras and Sokal. We have measured r(A^) for each implementation over a wide range of A^ 
on I?, I?, and I?, and report these results in Sec. 6. 

1.3 Outline of paper 

In Sec. 2, we give a detailed description of the SAW-tree data structure and associated methods which are required 
for implementing the pivot algorithm. 

In Sec. 3 we present heuristic arguments that T{N) for self-avoiding walks on Z^ and Z^ is (9(1), and numerical 
evidence which shows that for walks of up to A^ = 2^^ - 1 w 2.7 x 10^ steps T{N) is o(logA^) for Z^ and C>(logA^) 
for I?. We also discuss the behavior of our implementation for higher dimensions. 



Table 1: T{N), the mean time per attempted pivot for A^-step SAWs. A tighter bound for Z^ is reported in Sec. 3.5, 
but this relies on an untested assumption. 



Lattice Madras and Sokal Kennedy This work 

Predicted Observed 



,d>4 



O(A?0-8i) 


C»(A?0.38) 


0(1) 


o(logA^) 


O(A?0-89) 


(9(A?0.74) 


0(1) 


0{logN) 


o{N) 


7 


o(logA^) 


w{\ogN) 


OiN) 


7 


0(logA?) 


7 



In Sec. 4 we discuss initialization of the Markov chain, including details of how many data points are discarded. 
We also explain why it is highly desirable to have a procedure such as Pseudo_dimerize for initialization (pseudo- 
code in Sec. 2.7) when studying very long walks, and show that the expected running time of Pseudo_dimerize is 
0(A^). 

In Sec. 5 we discuss the autocorrelation function for the pivot algorithm, and show that the batch method for 
estimating confidence intervals is accurate, provided the batch size is large enough. This confirms the accuracy of 
the confidence intervals for our data published in [4]. 

Finally, in Sec. 6 we compare the performance of our implementation with previous implementations of the 
pivot algorithm [17, 9]. We show that the SAW-tree implementation is not only dramatically faster for long walks, 
it is also faster than the other implementations for walks with as few as 63 steps. 

2 Implementation details 

Self-avoiding walks (SAWs) are represented as binary trees (see e.g. [21]) via a recursive definition; we describe 
here the SAW-tree data structure and associated methods using pseudo-code. 

These methods can be extended to include translations, splitting of walks, joining of walks, and testing for 
intersection with surfaces. Indeed, for SAW-like models (those with short range interactions), it should be possible 
to implement a wide variety of global moves and tests for SAWs of A^ steps in time 0{logN) or better. It is also 
possible to parallelize code by, for example, performing intersection testing for a variety of proposed pivot moves 
simultaneously. Parallelization of the basic operations is also possible, but would be considerably more difficult to 
implement. 

In this section we give precise pseudo-code definitions of the data structure and algorithms. For reference, R- 
trees [7] and bounding volume hierarchies (see e.g. [10]) are data structures which arise in the field of computational 
geometry which are related to the SAW-tree. 

2.1 The self-avoiding walk 

For self-avoiding walks, the self-avoidance condition is enforced on sites rather than bonds, and this means that the 
SAW-tree is naturally defined in terms of sites. This representation also has the advantage that the basic objects, 
sites, have physical significance as they correspond to the monomers in a polymer. The only consequences of this 
choice are notational: a SAW-tree of n sites has n — 1 steps. We adopt this notation for the remainder of this section. 
When discussing the complexity of various algorithms we will still use A^ rather than n in order to be consistent with 
the companion article and other sections of the present work. 

An «-site SAW on Z'' is a mapping O) : {0, 1, . . . ,« - 1} ^- Z'' with |a)(/+ 1) — (b(/)| — 1 for each / (|;c| denotes 
the Euclidean norm of x), and with co{i) ^ (o{j) for all / ^ j. SAWs may be either rooted or unrooted; our convention 
is that the SAWs are rooted at the site which is at xi (unit vector in the first coordinate direction), i.e fi)(0) = xi. 
This convention simplifies some of the algebra involved in merging sub-walks, and is represented visually, e.g. in 
Fig. 1, by indicating a dashed bond from the origin to the first site of the walk. 
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Figure 1: A self-avoiding walk of 5 sites, which we will refer to as ©„. 



We denote the group of symmetries of H' as Gj, which corresponds to the dihedral group for d = 2, and the 
octahedral group for li = 3. This group acts on coordinates by permuting any of the d coordinate directions (dl 
choices), and independently choosing the orientation of each of these coordinates (2'' choices); thus G^/ has 2''dl 
elements. The group of lattice symmetries for Z^ therefore has 48 elements, and we use all of them except the 
identity as potential pivot operations; other choices are possible. We can represent the symmetry group elements as 
d xd orthogonal matrices, and the symmetry group elements act on the coordinates written as column vectors. 

We also define the (non-unique) pivot sequence representation of a self-avoiding random walk on Z'' as a map- 
ping from the integers to G^/, (7(b : {0, 1, . . . ,« — 1} — > G^. The sequence elements q{i) represent changes in the 
symmetry operator from site /—I to site /, while qabs{i) represent absolute symmetry operations, i.e. relative to the 
first site of the walk. We can relate this to the previous definition of a self-avoiding walk in terms of sites via the 
recurrence relations 

?abs(') = ^abs(' - 1)^(0' (1) 

a)(/) = a)(/-l) + ^abs(Oxi, (2) 

with 1 </<«—!, and initial conditions q{Q) — I, g'abs(O) = /, and ©(0) = xi. 

As noted by Madras and Sokal (footnote 10, pl32 in [17]), for the pivot sequence representation it is possible 
to perform a pivot of the walk in time (9(1) by choosing a site / uniformly at random, and multiplying q{i) by a 
(random) symmetry group element. However, the pivot sequence representation does no better than the hash table 
implementation of Madras and Sokal if we wish to determine if this change results in a self-intersection, or if we 
wish to calculate global observables such as R^ for the updated walk. 

Forgetting for the moment the self-avoidance condition, and using the fact that G^/ has 2'^d\ elements, we see 
that for random walks of « sites there are {2^d\Y^^ possible pivot sequences, while there are only (2(i)"^' random 
walks. This suggests that each random walk is represented by {2^'^^{d— 1)!)"^' pivot sequences. This can be 
derived directly by noting that given a pivot sequence q{\),q{2),q{y), ■ ■ ■ ,q{k),q{k+ !),■■■ ,q{n— 1), we can insert 
a pivot q which preserves the vector xi, between two elements q{k) and q{k+ 1) as follows 

q{0),q{l),q{2),q{3),--- ,q{k)q,q'^q{k+l),- ■ ■ ,q{n-l), 

without altering the walk. The number of symmetry group elements which preserve xi is 2'^d\/{2d) = 2''^^{d— 1)!, 
and there are n — 1 locations where these symmetry group elements can be inserted, leading to {2^^^(d— 1)!)"^' 
equivalent pivot representations for a random walk of n sites. Fore/ = 2, given a)(/) the recuiTence relations inEqs. 1 
and 2 only fix one of the two non-zero elements in q{i), leaving the choice of sign for the other non-zero element 
free. For our example walk ©„ we have 

0),, = ((1,0), (1,1), (2,1), (3,1), (3,0)). (3) 

We give three of the 16 equivalent choices for the pivot representation of (Oa, the first involving only proper rota- 
tions, the second with improper rotations for q{i) with 1 < / < 4, and the third with proper and improper rotations 
alternating: 

(2) //I oWo i\/o I \ ( I oWo 1 



^--^0 i)\i o)\i o)\o -1 j'U ))' ^'^ 

(3) //I oWo i\/o -iWi oW 1 



^--^0 ij'li o}\i oj'U -Ij'Ui ojj- (^) 

The non-uniqueness of the pivot representation for SAWs is due to the fact that the monomers (occupied sites) 
are invariant under the symmetry group Gj, i.e. it is not possible to distinguish the different orientations of a single 
site. The non-uniqueness is of no practical concern, but perhaps hints that it may be possible to derive a more 
succinct and elegant representation of walks than the mapping to Gj defined here. 

The merge operation is the fundamental operation on SAWs which allows for the binary tree data structure we 
call the SAW-tree. This is related to the concatenation operation defined, for example, in Sec. 1.2 of [16]; for 
concatenation the number of bonds is conserved, whereas for the merge operation the number of sites is preserved. 
Merging two SAWs with n and m sites respectively results in a SAW with «+w sites. It is convenient to also include a 
pivot operation, q, when merging the walks, and the result of merging two walks (i)\ — {(Oi (0), ©i (1), • • • , ©i (« — 1)) 
and©2 = (©2(0),CO2(l),--- ,C02(m-l))is 

merge (©1,^,0)2) = (©i(0),©i(l), • • • ,©i(n- l),©i(n - 1) +^©2(0), 

©i(n-l)+^©2(l),--- ,©i(n-l)+?©2(OT-l)). (7) 



The merge operation is represented visually in Fig. 2. To merge two sub-walks, pin the open circle of the left- 
hand sub-walk to the origin, and then pin the open circle of the right-hand sub-walk to the tail end of the left-hand 
sub-walk. Finally, apply the symmetry q to the right-hand sub-walk, using the second pin as the pivot. 
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Figure 2: Examples of the merge operation on SAWs: the open circle (empty site) of the right-hand sub-walks 
are fixed to the ends of the corresponding left-hand sub-walks, and the symmetry is then applied to the right-hand 
sub-walk. At the top, an identity symmetry operation is applied, while at the bottom the symmetry operation is a 
180° rotation. 



2.2 Definitions 

Here we define various quantities which are necessary for implementing our data structure and for calculating 
observables such as the mean-square end-to-end distance, R^. 

We first define various quantities which will be used to calculate observables which measure the size of a walk: 



Xe(a)) = (o{n- 1) 

(=0 

n-1 

X2{(0)= Y^ (0{l) ■ 0}{i) 



(vector); 
(vector); 

(scalar). 



(8) 
(9) 

(10) 



A bounding box of a walk is a convex shape which completely contains the walk. The obvious choice of shape 
for l/ is the rectangular prism with faces formed from the coordinate planes x, = const, I <i<d, with the constants 
chosen so that the faces of the prism touch the walk, i.e. the bounding box has minimum extent. Other choices are 
possible, e.g. other planes can be used such as x,- ± xy = const, 1 < / < ./' < d, and have the advantage of matching 
the shape of the walk more closely, but at the expense of more computational overhead and memory consumption. 
With closer fitting bounding boxes, fewer intersection tests need to be performed to ascertain whether two walks 
intersect. However, in practice, the coordinate plane rectangular prism implementation was fastest on our computer 
hardware (by a narrow margin), and has the benefit that it is straightforward to implement. The choice of bounding 
box for continuum models is not as obvious; possibilities include spheres and oriented rectangular prisms. 

We note that the choice of bounding box shape determines the maximum number of sites, b, a SAW can have so 
that it is guaranteed that its bounding box contains the sites of the SAW and no others. Suppose we are given two 
SAWs for which the bounding boxes overlap: if each of the walks has b or fewer sites, we can be certain that the two 
walks intersect, while if at least one of the walks has more than b sites, it may be that the walks do not intersect. The 
value of b determines the cut-off for intersection testing for the function Intersect in Sec. 2.6. For Z'' with d>2, 
the bounding box with faces formed from the coordinate planes leads to the maximum number of sites being two, as 
there are counter-examples with three sites (e.g. the bounding box of © = ( (0, 0) , ( 1 , 0) , ( 1 , 1 ) ) also contains (0,1)). 
For the bounding box with the faces being the coordinate planes and x,- ± x^ = const, the maximum number of sites 
is three (as the bounding box of O) — ((0,0), (1,0), (2,0), (2, 1)) also contains (1,1)). It is possible to push this one 
step further so that the maximum number of sites is four, but five is not possible as we can see that a)„ in Fig. 1 has 
five sites, and an unvisited site on its convex hull, which must also therefore be interior to any bounding box. 



We write bounding boxes as a product of closed intervals, in the form B{co) = x [infx, : x € (B,supx/ : x G ft)], 
where the product is taken over 1 < / < li. Consider a walk co, with bounding box B, which is split into left- and right- 
hand sub-walks, 0)' = (a)(0),- •• ,(a{k—l)) and co'' — {(a{k),--- ,co{n—l)), with bounding boxes fl' = x[fl,-,fe,], and 
B' = X [ci,di] respectively. We can then define the union operation on bounding boxes. 



b = b'ub' 



X i[ai,bi]U[ci,di]) 

x[inf{fl,-,c,},sup{fe,-,(i,}]. (11) 



The intersection operation is defined as 



B'nB''=x{[ai,bi]n[ci,di]) 

= x[sup{a;,c,},inf{Z7,,c/,}]. (12) 

There is no guarantee that sup{fl,,c,} < inf{bi,di}, and we adopt the convention that an interval [e,f] is considered 
empty if e > /. If any interval is empty, then the corresponding bounding box is also empty as it contains no interior 
sites. A quantity associated with the bounding box which we will find useful is the sum of the dimensions of the 
bounding box, Perim. If B = x [fl,-,fe,], then we define 

d 

Perim(B) = £(^-fl, + l). (13) 

1=1 

For cOa (in Fig. 1) we have the following values for the various parameters: 

n = 5; (14) 

B(a)„)-[l,3]x[0,l]; (15) 

Xe(a)„)-(3,0); (16) 

X(a)„)- (1,0) + (1,1) + (2,1) + (3,1) + (3,0) 

-(10,3); (17) 

X2(0)„) = (1,0)-(1,0) + (1,1)-(1,1) + (2,1).(2,1) 
+ (3,l)-(3,l) + (3,0)-(3,0) 
= 1+2 + 5 + 10 + 9 
= 27. (18) 

The observables R^'^, with x G {e,g,m}, 1 < ^ < 5, may be straightforwardly calculated from Xg, X, and X2. We 
give expressions for R^ with x e {e,g,m}, and note that higher Euclidean-invariant moments can be obtained via 

^x'' — i^x) (2 < ^ < 5) (these moments are calculated for I? in [3] and for Z^ in [4]). In addition we introduce 
another observable, ^^, which measures the mean-square deviation of the walk from the endpoint (0{n — l). 

= (Xe-Xi).(Xe-Xl) (19) 

1 n— 1 



4 = ^2LM0-coU)\' 



i.j=Q 

= -X2-\x-X (20) 

n n^ 

^m = fL[l«(0-«(O)l' + l«(0-«(«-l)fl 

= ^ + ^Xe-Xe--Xi-X--Xe-X+-X2 (21) 

2 2 n n n 



1 n— 1 



^m = -El«(0-«(«-l)l 



2 



= Xe-Xe--Xe-X+-X2 (22) 

n n 

In [4], we chose to calculate ^^ rather than R^, as it has a slightly simpler expression, and relied on the identity 
(^^) = {R^ . Compared with R^, ^^ has larger variance but smaller integrated autocorrelation time (for the pivot 



algorithm). Before performing the computational experiment in [4], we believed that given the same number of pivot 
attempts the confidence intervals for (^^) and {R^) would be comparable. We have since confirmed that working 
directly with R^ results in a standard error which is of the order of 17% smaller for li = 3, an amount which is not 
negligible; in future experiments we will calculate R^ directly. 

2.3 Guide to the interpretation of pseudo-code 

Here follow some comments to aid in the interpretation of the pseudo-code description of the SAW-tree data structure 
and associated algorithms. 

• All calls are by value, following the C programming language convention. Data structures are passed to 
methods via pointers. 

• Pointers: the walk w is a data structure whose member variables can be accessed via pointers, e.g. the vector 
for the end-to-end distance for the walk w is vv^Xg. The left-hand sub-walk of w is indicated by w', and the 
right-hand sub-walk by w''. This notation is further extended by indicating w" for the left-hand sub-walk of 



/ Ir 
W , W 



for the right-hand sub-walk of w', etc. 



• Suggestive notation for member variables used to improve readability; all quantities, such as "Xe" (the end-to- 
end vector) must correspond to a particular walk w. e.g. Xg = w^Xe, X^ = w'^Xe (i.e., superscript / indicates 
that Xg is the end-to-end vector for the left sub-walk), q' = w'^q, n' = w'^n. 

• Variables with subscript t are used for temporary storage only. 

• Comments are enclosed between the symbols /* and */ following the C convention. 

• Boolean negation is indicated via the symbol "!", e.g. ! TRUE = FALSE. 

2.4 SAW-tree data structure 

The key insight which has enabled a dramatic improvement in the implementation of the pivot algorithm is the 
recognition that sequences of sites and pivots can be replaced by binary trees. The n leaves of the tree are individual 
sites of the walk, and thus encode no information, while each of the n— \ (internal) nodes of the tree contain 
aggregate information about all sites which are below them in the tree. We call this data structure the SAW-tree, 
which may be defined recursively: a SAW-tree of n sites either has n = \ and is a leaf, or has a left child SAW-tree 
with Q <k <n sites, and a right child SAW-tree with the remaining n — k sites. 

Our implementation of the SAW-tree node is introduced in Table 2. A SAW-tree consists of one or more SAW- 
tree nodes in a binary tree structure; the pointers w' and w'' allow traversal from the root of the tree to the leaves, 
while w'' allows for traversal from the leaves of the tree to the root. SAW-trees are created by merging other SAW- 
trees, with a symmetry operation acting on the right-hand walk. In particular, any internal node w may be expressed 
in terms of its left child w', a symmetry operation q = w^q, and its right child w'' via a merge operation: 

w = merge(w ,^,w'^). (23) 



Table 2: Definition of the SAW-tree node data structure; see Sec. 2.2 for definitions of each of the member variables. 
The node contains variables which are required for traversal of the SAW-tree (w'\ w', w''), intersection testing («, q, 
Xg, B), and calculation of observables R\ with x G {e, g, m} (n, Xg, X, X2). If the only observable to be calculated is 
7?g, then X and X2 can be omitted. 



SAW-tree node data structure 


Type Name 


Description 


integer n 


Number of sites 


SAW-tree ptr w^ 


Parent 


SAW-tree ptr w' 


Left-hand sub-walk 


SAW-tree ptr w' 


Right-hand sub-walk 


matrix q 


Symmeti-y group element 


vector Xg 


co{n-l) 


vector X 


x=r:zocoii) 


integer X2 


X2=riZdco{i)-(oii) 


bounding box B 


Convex region 



The leaves of the SAW-tree correspond to sites in a SAW, and are thus labeled from to « — 1. A binary tree with 
n leaves has « — 1 internal nodes, and we label these nodes from 1 to n — 1, so that the symmetry q{i) is to the left 
of a)(/), a)(/ + 1), • • • , (b(« — 1). The symmetry ^(0) is not part of the SAW-tree as it is applied to the whole walk, 
and thus cannot be used in a merge operation. For some applications it may be necessary to keep track of ^(0), e.g. 
when studying polymers in a confined region, but in [4] this was not necessary. 

Assume that the end-to-end vectors, Xg, and symmetry group elements, q{i), for a SAW-tree and its left and 
right children are given. If we know the location of the anchor site of the parent node, Xabs, along with the overall 
absolute symmetry group element g'abs being applied to the walk, we can then find the same information for the left 
and right children as follows: 

i^ert. ^abs "^ ^abs 

g'abs -^ ?abs 

Right: .Xabs -^J^abs + ^absXe 

?abs -^ qAbf.q'' 

Thus ©(/) can be determined for any site / by iteratively performing this calculation while following the (unique) 
path from the root of the SAW-tree to the appropriate leaf. N.B.: Xabs must be updated before ^abs- 

We give explicit examples of SAW-trees in Appendix A. In Fig. 22, we give a SAW-tree representation of a 
SAW with n sites which is precisely equivalent to the pivot sequence representation. We also give two equivalent 
representations of (B„ (shown in Fig. 1) in Figs. 23 and 24. 

Conceptually we distinguish single-site walks (individual sites), which reside in the leaves of the tree, from 
multi-site walks.' In particular, the symmetry group element of a single site has no effect, and in the case where all 
monomers are identical then all single sites are identical. 

If the SAW-tree structure remains fixed it is not possible to rotate part of the walk by updating a single symmetry 
group element, in contrast to the pivot sequence representation. This is because when we change a symmetry group 
element in a given node, it only alters the position of sites which are in the right child of the node. To rotate the part 
of the walk with sites labeled i + 1 and greater, we choose the /* internal node of the SAW-tree from the left. We 
then need to alter the symmetry group element of this node, and also all nodes which are above and to the right of it 
in the SAW-tree. If we select a random node then it will likely be near the leaves of the tree, and assuming that the 
SAW-tree is balanced this means that on average O(logA^) symmetry group elements will need to be altered. 

However, we note that the root node at the top of the tree has no parents, and therefore only one symmetry group 
element needs to be altered to rotate the right-hand part of the walk in this case. By utilizing tree-rotation operations, 
which alter the structure of the tree while preserving node ordering, it is possible to move the !* node to the root of 
the SAW-tree. Once this has been done, it is then possible to implement a rotation of part of the walk by updating a 
single symmetry group element. On average, 0{\ogN) of these tree-rotation operations are required. 

Binary trees are a standard data structure in computer science. By requiring trees to be balanced, i.e. so that 
the height of a tree with A^ nodes is bounded by a constant times logA^, optimal bounds can be derived for oper- 
ations such as insertion and deletion of nodes from the tree. We refer the interested reader to Sedgewick [21] for 
various implementations of balanced trees, such as red-black balanced trees. We have the advantage that our SAW- 
tree is, essentially, static, which means that we can make it perfectly balanced without the additional overhead of 
maintaining a balanced tree. 

2.5 Primitive operations 

Included in this subsection are the primitive operations, which would generally not be called from the main program. 
Left and right tree-rotations are modified versions of standard tree operations; for binary trees, only ordering 
needs to be preserved, while for SAW-trees the sequence of sites needs to be preserved, which means that symmetry 
group elements and other variables need to be modified. 
Procedure: 
Merge(5AW-free pfr w', SAW-tree ptr w'', SAW-tree ptr w) 

I* Two SAWs are joined together, head to tail. Merge w' and w'' into w, i.e. w <= merge(w',g',w'). 
Pointers are not altered. See Eq. 7 and Fig. 2. */ 



n 


<= n' +n' 


B 


■i= B'U(X^+^B'-) 


Xe 


<= % + qK 


X 


■^ X'+qX^ + n^Xl 


X2 


<= X{+X; + 2X',-{qX' 



-n'K-Xl 
End /* Merge 

'Technical note: as the leaves are all identical, in practice we use a sentinel node for the leaves, saving on memory usage. 
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Figure 3: Left tree-rotation applied to a SAW-tree, where A, B, and C are arbitrary SAW-trees. LHS and RHS are 
different representations of the same self-avoiding walk. 



Procedure: 

\M{SAW-tree ptr w) 

I* Left tree -rotation applied to w. Update of pointers to parents not shown, and note that only one merge 
operation is necessary. The pseudocode is faithful to the implementation used in the present work 
and [4], but the definition of this operation is likely to change in future implementations. In Fig. 3, 
w refers to the node with symmetry q\ on the LHS, and q\q2 on the RHS. In future, we will adopt 
the convention that w always refers to the same node with respect to left-right ordering. By this 
convention, w would refer to node with symmetry qi on the RHS. 
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Figure 4: Right tree-rotation applied to a SAW-tree, where A, B, and C are arbitrary SAW-trees. LHS and RHS are 
different representations of the same self-avoiding walk. 



Procedure: 

RR(SAW-tree ptr w) 

I* Right tree-rotation applied to w. Update of pointers to parents not shown. Note that only one merge 
operation is necessary. The pseudocode is faithful to the implementation used in the present work and 
[4], but the definition of this operation is likely to change in future implementations. In Fig. 4 w refers 
to the node with symmetry q2 on the LHS, and q\ on the RHS. In future, we will adopt the convention 
that w always refers to the same node with respect to left-right ordering. By this convention, w would 
refer to node with symmetry q^qi on the RHS. 



*/ 



Wt 


<^ 


w' 


w' 


<= 


Wt^W 


Wt^W 


<= 


Wt^W^ 


Wt^w'' 


<= 


w' 


w' 


^ 


w, 


% 


<= 


q 


q 


<= 


q' 


q' 


<= 


q ^qt 


Merge(w''' 


>-, 


,wO 


1 /*RR 







*/ 

Function: 

Find_node(/nfeger «,, SAW-tree ptr w) 

I* Returns a pointer to the n* node from the left in the SAW-tree w. This may either be implemented as a 
numerical function, if the address of the correct node can be easily determined (such as if the nodes of 
w are arranged in memory in pre-order fashion), or returned from a look-up array. This look-up array is 
static, and so will not need to be updated after it is created. Note: we require this function to take time 
(9(1), and thus it is inappropriate to use a binary search implementation which may take time 0{\ogN). 
This function is required by Attenipt_pivot_fast. */ 

2.6 User level operations 

Included in this subsection are user level operations which would typically be called from the main program. 

Procedure: 

Generate_SAW-tree(/nfeger n) 

I* Allocates memory for the SAW-tree, and creates the initial arrangement of the tree in memory. We use 
a strictly balanced tree layout, which guarantees O(logA^) behavior for basic operations. We tested pre- 
order and van Emde Boas layouts; given that performance for large A^ was memory bound, we were 
surprised to find that the pre-order layout was fastest, although we will experiment more with this in the 
future (the van Emde Boas tree layout [24] is an example of a cache-oblivious data structure [5, 12]). */ 

Function: 

Randoni_integer_uniforni(/nfeger a, integer b) 

I* Returns an integer in the interval [a,b) selected uniformly at random. */ 

Function: 

Randoni_integer_log(/nfeger a, integer b) 

I* Returns an integer / selected in the interval [a,b) with probability 

\og{b-a + \) 

This probability distribution has the property that if we set a = 1 for convenience and choose a length 
scale \ <x <2x<b, then 

log(2x)— logx 



\ogb 
log2 



(25) 



log/?' 
i.e. the probability of ; lying in the semi-open interval [jc, 2x) is independent of x. */ 

Function: 
Random_symmetry() 

/* Symmetry selected uniformly at random, excluding identity. Other choices are possible. */ 

Procedure: 

Accuniulate_statistics(5'AW-free w. Boolean success) 



10 



/* Accumulate statistics for Euclidean-invariant moments R^'' with x e {e,g,m}, and I < k < 5. When 
the pivot attempt is unsuccessful there is no need to recalculate observables; in fact, by using a counter, 
updates to storage variables only need to be made when success =TRUE, i.e. when the last pivot attempt 
was successful. */ 
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Figure 5: Shuffle up operation applied to node with symmetry qj, in a SAW-tree, via a sequence of left and right 
tree-rotations, with A, B, C, and D arbitrary SAW-trees. LHS and RHS are different representations of the same 
self-avoiding walk. 



Procedure: 

Shuffle_up(/nfeger no, SAW-tree ptr w) 

/* Brings node no to the root of the SAW-tree, via a series of tree-rotation operations. See Fig. 5. 
if no = n' then 

/* Do nothing, node already at root. 
else if no < n' then 
Shuffle_up(no,w') 
RR(w) 
else if no > n' then 

Shuffle_up(no — n' ,w^) 
LR(w) 
end if 
Return 
End /* Shuffle_up 



*/ 
*/ 



*/ 



Procedure: 

Shuffle_down(5AW-free ptr w) 

/* Restores node at root to its correct place in the balanced SAW-tree, via a series of tree-rotation oper- 
ations. Note that this operation is only necessary if one wishes that the SAW-tree remains balanced, 
and must be paired with Shuffle_up. */ 

n,^L(n + l)/2j 
if n, = n' then 

/* Do nothing, node is in correct place. */ 

else if n, < n' then 

RR(vv') 

Shuffle_down(w'^) 
else if n, > n' then 

LR(w) 

Shuffle_down( w' ) 
end if 
Return 
End /* Shuffle_down */ 

The function Intersect is at the heart of our implementation of the pivot algorithm; we recommend the reader 
consult the start of Sec. 3.2 for an explanation of how Intersect works. 
Function: 

Intersect (vector x[^^^, matrix q^]^^, SAW-tree ptr w , 
vector x''^^^, matrix q'^^f^^, SAW-tree ptr wO 
/* Returns TRUE if w' and w'' intersect, FALSE otherwise, x'^^^ and x''^^^^ are the absolute positions of the 
anchor points of the walks w' and w'', and q^^^^ and q''^^^^ are overall symmetry group elements. */ 
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/* First, calculate the absolute positions of the bounding boxes. If they do not intersect, then w' and w'' 
cannot intersect. */ 

B',^x'+q'B'' 
ifB[nB[ = 0then 
Return FALSE 
end if 
ifn' < 2 and ;/ < 2 then 

/* The bounding boxes of w' and w'' intersect, and thus w' and w'' must intersect as each walk has two 
or fewer sites. The cut-off is dependent on the shape of the bounding box, and is the maximum 
number of sites a SAW can have which guarantee that the bounding box contains only the sites of 
the SAW and no others. For Zf' the value of the cut-off for the rectangular prism is two as given 
here; see Sec. 2.2 for further discussion. */ 

Return TRUE 
end if 
if «' > «'^ then 

/* Split the left SAW-tree; compare the SAW-trees which are closest together on the chain first, as 

they are the most likely to intersect. */ 

if Intersect(4b^ + ^^i,^X^', q[^^q', w'', ^^s. ?abs' >^') then 

Return TRUE 
end if 
Return Intersect(j:^|,^, ^^^^,, w", x^^^, q^^^^, w') 

else 

/* Split the right SAW-tree; compare the SAW-trees which are closest together on the chain first, as 

they are the most likely to intersect. */ 

if Intersect(x^bs' ?L' ^'' <bs' ?abs' ^^'■') then 

Return TRUE 
end if 

Return Intersect(x^|,^, q'^^, w', x[^^ + ql^,^X'J, q'^^^q', w") 
end if 
End /* Intersect */ 

Shuffle_intersect is akin to the procedure developed by Madras and Sokal [17] for intersection testing: by 
building new walks incrementally moving outwards from the pivot site they achieved a speed-up from 0{N) to 
0[N^^P) for unsuccessful pivot attempts. In Sec. 3 we present a heuristic argument that the speed-up achieved for 
the SAW-tree implementation is from 0{\ogN) to 0(1). 
Function: 

Shuffle_intersect(5'AW-free ptr w, symmetry qo, integer was_left_child, 
integer is_left_child) 
/* Hybrid algorithm which combines elements of Shuffle_up and Intersect, w is the pivot node, and 
qQ is the pivot operation which acts on the part of the walk to the right of w (both ancestors and 
descendants). was_left_child and is_left_child are integer flags specifying the local tree structure. 
Function returns true if the left- and right-hand walks intersect, FALSE otherwise. The pseudo- 
code here reproduces the method, but some small-scale optimizations have been omitted for the sake 
of clarity. In particular, some computations are performed more often than strictly necessary, e.g. 
rotations of bounding boxes. For 1?, this results in the implementation here running approximately 
40% slower than the optimized implementation. By using temporary variables to store the modified 
tree nodes, we guarantee that the original walk is left unmodified. Thus it is safe to execute concurrent 
versions of this function on the same SAW-tree in parallel. */ 

/* Check if the left- and right- children of w intersect. */ 

if was_left_child = 1 then 

/* We have already verified that w' and w*^' do not intersect; check if w' and w"^ intersect. */ 

if lntersect(0, /, w', X'^ + qq^X''^ , qqoq'', w'"'') then 

Return TRUE 
end if 
else if was_left_child = then 

/* We have already verified that w'*^ and w'' do not intersect; check if w" and w'' intersect. */ 

if lntersect(0, /, w", Xg, qq^, w') then 
Return TRUE 
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end if 
else if was_left_child = — 1 then 

/* First call of this function, and so we have no prior information; check if w' and w' intersect. */ 

if lntersect(0, /, w', Xg, qq^, w'') then 
Return TRUE 

end if 
end if 

/* Check to see if we have reached the top of the tree. */ 

if w'' = NULL then 

Return FALSE 

end if 

/* Not at top of tree, and so we need to perform a left or right rotation. First we need to determine if w^ 

is a left or right child. */ 

if wPP = NULL then 

/* wP is the root of the tree so is_left_child_new will not be needed. */ 

else if w''P' = w^ then 

is_left_child_new <= TRUE 
else 

is_left_child_new <i= FALSE 
end if 
/* Create a temporary SAW-tree node w, which is a copy of w^ so we do not alter the original walk. */ 

Wt -4= w'' 

if is_left_child then 

/* right tree-rotation needed. */ 

RR(w,) 
else 

/* left tree-rotation needed. */ 

LR(h',) 
end if 

Return Shuffle_intersect(w,,^o4s_left_child, is_left_child_new) 
End /* Shuffle_intersect */ 

2.7 High level functions 

Here follows a pseudo-code description of three routines which provide high level functionality for typical usage of 
the pivot algorithm. 

We define two versions of the function Attenipt_pivot, one which is conceptually simple, while in Sec. 3 we 
argue that the other version is asymptotically faster for 7? and 7? . 
Function: 
Attenipt_pivot_siniple(5'AW-freepfr w, integer rit, symmetry qt) 

I* Attempt to apply a pivot to w at site n, with symmetry qt; if successful update the walk, otherwise 
leave walk unchanged. Return a boolean value to indicate whether the pivot operation successfully 
changed the walk. */ 

Shuffle_up(n,,w) 

q^qqt 

intersection <= lntersect(0, /, w', Xg, q, w'') 

if intersection then 

/* Reject pivot, restore original symmetry. */ 

q^qq^^ 
end if 
Shuffle_down(w) 

/* Pivot was successful if sub-walks did not intersect. */ 

Return ! intersection 
End /* Attempt_pivot_simple */ 

Function: 

Attenipt_pivot_fast(5AW^-free/?fr w, integer nt, symmetry qt) 
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/* Attempt to apply a pivot to w at site «( with symmetry qf, if successful update the walk, otherwise 
leave walk unchanged. Return a boolean value to indicate whether the pivot operation successfully 
changed the walk. */ 

Wt <^ Find_node(nr) 
if wf ' = NULL then 

/* Dummy argument, is_left_child will not be required. */ 

is_left_child <= TRUE 
else if wf = w, then 

is_left_child 4= TRUE 
else 

is_left_child -^ FALSE 
end if 

intersection <= Shuffle_intersect(W(,(7,,-l,is_left_child) 
if !intersection then 

/* Accept pivot, perform symmetry operation. */ 

Shuffle_up(nr,w) 

Shuffle_down(w) 
end if 

/* Pivot was successful if sub-walks did not intersect. */ 

Return ! intersection 
End /* Attempt_pivot_fast */ 

Procedure: 
Pseudo_dimerize(5AW-rree w) 

I* Uses merge and pivot operations to generate an initial A^-step SAW in time 0(A^) which is difficult 

to distinguish from a SAW sampled uniformly at random. See Sec. 4 for discussion about why it is 

highly desirable to have such a procedure, and also for analysis of the algorithmic complexity. */ 

/* Generate initial left- and right-hand sub-walks. */ 

Pseudo_dimerize(w') 
Pseudo_dimerize(w'^) 
/* Perform pivot operations on each of the sub-walks while attempting to merge them. When the two 

sub-walks are mutually avoiding, exit loop. */ 

do 

«( <^ n' Random_integer_log(l,n') 

qt <= Random_symmetry() 

Attempt_pivot(w', n,, qt) 

lit <= Random_integer_log(l,n'^) 

qt 'i= Random_symmetry() 

Attempt_pivot(w'^, n,, qt) 

q <^ Random_symnietry() 
whOe lntersect(0, /, w', Xg, q, w^) 
Merge(vv',w'',w) 

/* Perform additional pivots on the SAW, preferentially sampling close to the joint, in an attempt to re- 
duce any sampling bias. We can attempt o(n/logn) pivots without changing the asymptotic behavior 

of the algorithm; we choose to attempt n^'^ pivots. We do not believe that these additional pivots are 

strictly necessary. */ 

for ;■ = 1 to n^l'^ do 

tit <^n'— Random_integer_log(l,n') 

qt <= Random_symnietry() 

Attempt_pivot(w, n,, qt) 

I'*- Now the walks are combined, we have to shift node label by n'. */ 

rit ^n'+ Random_integer_log(0,n' ) 

qt <= Random_syninietry() 

Attempt_pivot(w, n,, qt) 
end for 
Return 
End /* Pseudo dimerize */ 
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2.8 Main program 

Here follows a pseudo-code description of the main program to generate self-avoiding walks via the pivot algorithm. 
Procedure: 
MainO 

/* Create SAW-tree structure. */ 

w <= Generate_SAW-tree(M) 

/* Initialize SAW-tree. */ 

Pseudo_dimerize(w) 

/* Warm up the Markov chain by discarding ndiscard time steps. */ 

for / = 1 to ^discard dO 

Ht <^ Random_integer_uniform(l,n) 
qt <^ Random_symmetry() 
Attempt_pivot(w, n,, qt) 
end for 

/* Perform the computer experiment, collecting data on global observables at every time step. */ 

for / = 1 to Hsample dO 

n, <^ Random_integer_uniform(l,n) 
qt <^ Random_syinmetry() 
success = Attempt_pivot(w, n,, qt) 
Accumulate_statistics(w,success) 
end for 
End /* Main */ 

3 Algorithmic complexity 

Our goal in this section is to calculate the mean time per attempted pivot, T{N), which can in turn be expressed in 
terms of the mean time for successful pivots, Ts{N), and the mean time for unsuccessful pivots, T]j{N). This is done 
as follows: 

T{N) = Pr(successful)rs(A^) +Pr(unsuccessful)ru(A^) 

= OiN-P)TsiN) + Oil)MN) 

= N-"Ts{N) + Tv{N), (26) 

where the probability of a pivot being successful is 0{N^p) as mentioned in Sec. 1.1, and constant factors have been 
dropped. Note that this expression is valid for I? and I?, while higher dimensions are discussed in Sec. 3.5. 

The thread of argument is complicated by the fact that we have two implementations we wish to characterize, 
Attempt_pivot_simple and Attempt_pivot_fast. As the names indicate, we will find that Attempt_pivot_fast is 
asymptotically faster, and so T{N) may be regarded as being the mean time per attempted pivot for the procedure 
Attempt_pivot_fast. 

We first introduce notation and do a minor calculation in Sec. 3.1. In Sec. 3.2 we give a detailed explanation of the 
behavior of the procedure Intersect, and develop a heuristic argument that Ts{N) is 0(logA^). In Sec. 3.3 we show 
that for Attempt_pivot_simple Ti]{N) — &{logN), and therefore T{N) — &{logN), while in Sec. 3.4 we provide 
a heuristic argument that for Attempt_pivot_fast T\j{N) — 0(1), which leads to the prediction that T{N) ~ 0(1). 
In Sec. 3.5 we discuss the complexity of the algorithm for dimensions greater than three, and finally in Sec. 3.6 
we examine the numerical evidence. There is modest numerical evidence supporting our conclusion that Ts{N) is 
0(logA^) for I? and I?, while the numerical evidence suggests T[j{N) = o{\ogN) for I? and T\j{N) = I{\ogN) for 
I?. This is consistent with Ti]{N) = 0(1), but far from conclusive. 

A far simpler version of the argument for our prediction that r(A^) =0(1) for I? and I? is given in the com- 
panion article [4] . 

3.1 Background 

We introduce the notation of the level, I, of a node, which is the number of generations that separate the node from 
the leaves of the tree. The leaves are at level zero, their parents are at level one, etc.. This is non-standard notation, 
as usually the concept of depth is used to describe trees, with the root at depth zero, its children at depth one, etc.. 
For a perfectly balanced tree with 2*^ leaves (sites), the nodes at any fixed level represent sub-walks of the same 
number of sites independent of k, and for this reason we find the level notation to be useful. 
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We will specialize to the case where we have a SAW-tree, W, with n = l'^ sites; this makes things somewhat 
simpler to deal with, but makes no difference otherwise to the algorithmic complexity. W therefore has 2*^ leaves, 
2*^—1 internal nodes, and the level of the root is k. 

The first step of the pivot algorithm is to select a pivot node uniformly at random, from the n — 1 internal nodes 
of W. The average level of this node is 



E(/) = ;^^^Ei2' 
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(27) 

We note that the discussion below is sufficiently general that it applies to related models of polymers in the 
excluded volume limit, such as the Domb-Joyce model, interacting self-avoiding walks, self-avoiding walks on 
other lattices, or self-avoiding walk models in the continuum. SAW-tree implementations of these other models do 
not yet exist. The discussion is valid for dimension d >2, but there will be some subtleties for d >4- which will be 
explored in Sec. 3.5. 

3.2 Successful pivot 

To estimate the expected running time of Intersect, we first need to get a clear understanding of what the algorithm 
does. Intersect recursively tests for the intersection of the bounding boxes of the left and right SAW-trees, and 
searches until either an intersection has been found, or until it has been shown that no intersection is possible (i.e. no 
bounding boxes intersect). The search proceeds by recursively splitting the longer of the left and right SAW-trees, 
and although the action of Shuffle_up on the SAW-trees complicates things somewhat, the comparison typically 
occurs between left and right SAW-trees which are of the same length, up to a factor of two. We note that the search 
method chosen is depth-first search, which minimizes memory usage, and also that the choice of splitting procedure 
is a key feature in the performance of the algorithm. By choosing to split the larger walk, and then first testing for 
intersection between the walks on the left and right hand sides which are closest together on the chain, this means 
that intersections are more likely to be found rapidly, allowing the search to terminate after only a small portion of 
the walk has been examined. 

For convenience we will only consider testing for intersection between the left- and right-hand sides of W, i.e. 
we assume the pivot node is the root of the SAW-tree. This will simplify the discussion of the algorithm, but the 
conclusions drawn will be valid for any balanced SAW-tree, with any number of sites. 

We will first examine the behavior of Intersect in the case that a pivot attempt is successful, when there is no 
intersection. Later we will consider what happens when an intersection is found. 

When appHed to W, Intersect compares sub-walks on the left-hand side at level j with sub-walks on the right- 
hand side at levels j and 7 '+ 1 . When bounding boxes overlap, the search proceeds to the next lowest level in the 
tree, until there are no overlaps. The nodes that this search procedure visits within W induces a binary tree W' with 
internal nodes that are sub-walks whose bounding boxes have been found to overlap sub-walks on the other side, and 
whose leaves are sub-walks which do not overlap sub-walks on the other side. As W' is a binary tree, the number 
of leaves is exactly one more than the number of internal nodes. The algorithmic complexity of Intersect, Ts{N), 
is the number of intersection tests performed between sub-walks on the left- and right-hand sides of W' (neglecting 
constant factors). 

We seek to simplify the characterization of the counting problem, by first noting that the number of intersection 
tests which involve leaves of W' is at most four times the number of intersection tests involving nodes of W', as we 
only test for intersection between two sub-walks when both of their parents intersect. Thus 7s (A^) is asymptotically 
equal to the number of pairs of sub-walks with overlapping bounding boxes, where sub-walks on the left-hand side 
are at level j, and sub- walks on the right-hand side are at levels j and j + 1. We then note that the number of pairs 
of sub-walks with overlapping bounding boxes with these rules are at most twice the number of pairs of sub-walks 
with overlapping bounding boxes on the same level. Ts{N) is therefore given by the number of pairs of sub-walks 
on the left- and right-hand sides of W with overlapping bounding boxes which are at the same level. 

We introduce a new notation, describing the overlap between the bounding boxes of two sub-walks at level j in 
W as a "/-approach", while any overlap between boxes of sub-walks which are at the same level we denote as an 
"approach". Thus Ts {N) is given by the expected number of approaches between the left- and right-hand sides of 
W. 

We give an example of a SAW with 16 sites in Fig. 6, and show its corresponding SAW-tree in Fig. 7, with 
arrows drawn between sub-walks on the left- and right-hand sides which approach each other. 
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Figure 6: cOb, a self-avoiding walk of 16 sites. 




Figure 7: A SAW-tree representation of COt,, with arrows drawn between nodes on the left- and right-hand sides 
which approach each other. 

To this point our argument is exact, but we must now resort to heuristic arguments to count the expected number 
of approaches for W. 

We first note that although there are typically 0{N) nearest neighbor contacts for a SAW of length A^, the number 
of contacts between two halves of a SAW is typically (9(1), as shown via renormalization group [18] and Monte 
Carlo [2] methods. When we attempt to pivot part of a SAW, it is guaranteed that each of the two sub-walks remain 
self-avoiding, and hence we only need to determine if the left- and right-hand sub-walks intersect. If the resulting 
walk is self-avoiding, then we expect, on average, that there will be a constant number of contacts between the two 
sub-walks. 

We now consider the renormalization group transformation along the polymer chain described by Kremer et 
al. [11] in the context of a Monte Carlo renormalization group calculation (see also [6, 20]), which maps a polymer 
chain with hard sphere interactions to a new chain by rescaling the number of monomers (n), the separation between 
monomers (L), and the hard sphere diameter (D), so as to keep the mean-square end-to-end distance fixed. The 
interaction strength is defined as 5 = D/L. Kremer et al. concluded that this renormalization group transformation 
converged to a non-trivial (SAW) fixed point, with interaction strength 5* > 0. In the neighborhood of the fixed 
point the renormalization transformations are 



L' — As^L, and 
D' =As''D, 



(28) 

(29) 
(30) 



for some positive constant A. 

We can translate this to our SAW-tree representation of a self-avoiding walk with 2* sites if we regard the sub- 
walks of each node at a particular level in the tree as the monomers of a renormalized self-avoiding walk. We 
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consider the transformation that occurs as we pass from level j to level j + 1, with 1 ^ 2-* <C 2*^. Sub-walks at 
level j have rij — 2*^^ monomers, and as each sub-walk contains 2-' sites the mean separation between monomers is 
therefore Lj —Ai{2Jy for some constant Ai. The corresponding quantities for level j +1 are 

nj+i=2'--'-' = "^, (31) 

L,+i=Ai(2^'+')^=2%-. (32) 

We note that the expectation of the perimeter of the bounding box of a self-avoiding walk with n monomers is A2n^ 
for some constant A2 (see Eq. 13 in Sec. 2.2 for the definition of the perimeter). If we take the (effective) hard 
sphere diameter at this particular level j to be some constant multiplied by the expected perimeter at level j, Pj, 
Dj =Aj,Pj = A3A2(2-')*', and we pass to level y + 1, then in order to keep the interaction strength fixed we must have 



Dj+i 


= 2^D; 




= 2M3P, 




= 2M3A2(2>)^ 




= A3A2(2^-+i)^ 




^AiPj+i, 



(33) 

i.e. the new effective hard sphere diameter is given by the same constant A3 multiplied by Pj+i, and hence by 
induction D,„ = Aj,P,„ Vm > j. Thus the nodes at level j in a SAW-tree of total depth k, where 1^2^^ 2*, may be 
regarded as a self-avoiding walk with 2*^~^ monomers, separated by links of length Lj = Aj (2-')*', with hard sphere 
monomers of diameter Dj — A^Pj. This renormalization procedure becomes exact in the limit of large j and k . 

We now note that approaches for the SAW-tree correspond to contacts for the renormalized walks, where the 
minimum distance at which two monomers of the walk are said to be in contact is a fixed multiple of the hard sphere 
diameter. For fixed values of the interaction strength parameter and contact distance the number of contacts between 
the left- and right-hand sides of a walk is of (9(1), and we now know that as we proceed up the tree from the leaves 
to the root we are converging to a fixed value of the interaction strength. Thus the expected number of contacts 
between the two halves of the walk also converges to a constant. 

Let the expected number of approaches at level j for a walk with n sites be c„j. The previous arguments imply 
that lim„^„oC„., = Cj for some positive constant Cj, where the sequence is not necessarily monotonic. We then 
expect that the limit limy^ocCy ~ c* exists, and that the Cnj are bounded, i.e. there exists a constant c^ such that 
c„j <c^ VnJ. Thus 

log2« 
1022" 

< Let 
.7=0 

= ctlog2n, (34) 

and so Ts.{N) = 0{logN). In fact, we expect that c„.y w c* for sufficiently large n and j, and so Ts{N) — &{logN). 

3.3 Algorithmic complexity of Attempt_pivot_simple 

On each iteration Attempt_pivot_simple executes the procedures Shuffle_up and Shuffle_down as well as the 
function Intersect. 

The expected running time of both Shuffle_up and Shuffle_down is independent of whether the pivot attempt 
is successful or not. From Sec. 3.1, the expected level of a SAW-tree node selected uniformly at random is two. 
Thus the expected number of tree-rotations that Shuffle_up must perform in order to bring a pivot node to the top 
of a SAW-tree with k levels, or Shuffle_down must perform to restore the node to its correct place in the tree, 
is k — 2 + 0{k/2''). Given that k ~ log2n, the algorithmic complexity of both Shuffle_up and Shuffle_down is 
therefore 6>(logA^). 

We already know Intersect takes time ©(logA^) in the case that the pivot attempt is successful, which combined 
with the 0(logA^) complexity of the shuffle operations gives 

TsiN) = &{logN). (35) 
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When the pivot attempt is unsuccessful and an intersection is found. Intersect terminates early and thus takes 
time 0{logN) (we will argue in Sec. 3.4 that it is in fact (9(1), but this makes no difference to the argument here). 
Combined with the 0(logA^) complexity of the shuffle operations this results in 

Tu{N)^&{\ogN). (36) 

For Attempt_pivot_siniple we therefore have overall algorithmic complexity of 

T{N)=N-PTs{N) + Tv{N) 

= N-P&ilogN) + &{\ogN) 

= @{logN). (37) 

3.4 Algorithmic complexity of Attempt_pivot_fast 

On each iteration Attempt_pivot_fast executes the function Shufflejntersect, while it only executes the procedures 
Shuffle_up and Shuffle_down when a pivot attempt is successful. 

The key point to understand regarding the action of Shuffle_intersect, is that this function shuffles the pivot 
node up the SAW-tree only as far as necessary to find an intersection between the left- and right-hand sub-walks. In 
the case that the pivot attempt is successful, Shufflejntersect must therefore shuffle the pivot node up to the top 
of the SAW-tree, but in the case where the pivot attempt is unsuccessful the pivot node is only shuffled up just far 
enough to be able to find this first intersection. 

When the pivot attempt is successful, Shufflejntersect must search the whole SAW-tree for intersections, and 
so is exactly equivalent to performing Shuffle_up, Intersect, and Shuffle_down. As shown in Sections 3.2 and 3.3, 
all of these procedures take time 0(logA^), and thus 

Ts{N) = &{\ogN). (38) 

The situation is considerably more complicated when the pivot attempt is unsuccessful and an intersection is 
found. We suppose that the pivot node is the y* (internal) node from the left, which has j sites (leaves) to the left, 
and n — j sites to the right. Shufflejntersect will then search W until it finds the intersection with j — // being 
the label of the left-hand site and j + ir the label of the right-hand site such that sup(//, iV) is minimum. Note: this 
description is approximate, as the SAW-tree structure complicates the issue of exactly which pair of intersecting 
sites is found, and the actual // and /,- which are found may be different from the optimal values. However, we are 
confident that for the average case this argument is correct up to a constant factor in the size of ;'/ and iy 

The sites j — ii and j + ;V are exactly the pair of intersecting sites which would be discovered in the usual 
implementation of the pivot algorithm [17], where the new walk is incrementally built by moving outwards from 
the pivot site. Let us assume, without loss of generality, that /,. > /;. We define Pr(/) as the probability that / is 
the intersection that is found, i.e. the conditional probability that the sub-walk with sites [j — i,j + i] has a self- 
intersection, but there are no intersections between sites in the open interval [j — i,j + i). Madras and Sokal [17] 
convincingly argued that the expected value of /,- is 

E(/,) = ^ /,Pr(/,) 

= 0{N^-P), (39) 

with p positive and close to zero for 7? and Z^ . 

The interval [j — ii,j + ;V] corresponds to a self-avoiding loop, and it may be quite difficult to characterize the 
probability space of these configurations. We strongly suspect that these configurations are sufficiently close to self- 
avoiding walks so that the mean number of approaches between the intervals [j — ii,j) and [j,j + /,] is (9(log;V). 
However, we do not have an argument to support this statement, and so we now make the assumption that the mean 
time for Shufflejntersect to confirm that the sub-walk {j — ir,j + ir) is non-intersecting is (9(log/,). To then find 
the intersection between j — i[ and j + i^ will take additional time of (9 (log iV) • 

We now take j to be a typical node, which places it near the middle of the walk with j w 2^^\ and at level two 
in the SAW-tree. To determine the expected time to find the intersection between // and ir, we need to determine the 
probability distribution for ir, given j = 2*^^' = N /2 (fixing j makes the argument simpler, but does not change the 
result). Eq. 39 suggests that for sufficiently large /,, Pr(iV) = 0{ir '')■ We therefore neglect sub-dominant terms 
and take the leading order approximation Pr(;V) ~ A//, ''; this may be a very bad approximation for small ir- 
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Using this approximation, we can now proceed to determine Tjj (A^) : 

N/l 

T\j{N) — y\ (Time to find intersection) Pr(/r) 

N/2 4 

= Ilog'V:TT;^- (40) 

We observe that for large A^ T]j{N) approaches a constant from below, i.e. Tu{°o) = 0(1). To determine the rate of 
approach, we examine T\j{°o) — T\j{N): 



JN/l i+P 



' 'D {•'< 



21 . , yP^Og- + l] (41) 



Without detailed understanding of the exact form of Pr(iV), it is impossible to estimate T\j{o°) — T\j{N) for any par- 
ticular value of A^. However, we can gain a rough idea of the rate of approach to the constant T\] (oo) by determining 
a natural length scale, L, from the solution of 

Moo) -ML) ^ 1 

7bH-7b(l) e' 

i.e. L is the length at which our approximation for the deviation from the limiting value has decayed to 1/e of its 
initial value. For I?, ;? sa 0.19 which gives L sa 1.6 x 10^, while for I?, p w 0.11 and so L w 5.8 x 10^. We can 
clearly see that convergence to constant behavior may indeed be very slow, particularly for I? . 
For Attempt_pivot_fast we therefore have overall algorithmic complexity of 

T{N)=N-PT^{N) + MN) 
= 0{N-nogN+\) 
= 0{\). (43) 

In practice, sub-leading terms may result in behavior which is quite different from (9(1) for lengths of the order of 
millions or even billions of steps. 

3.5 J > 3 

For d>3 the time spent on successful pivots is non-negligible, and this changes the overall mean time per attempted 
pivot. 

As discussed by Madras and Sokal [17], we expect that the exponent associated with the acceptance fraction, p, 
is of the same order of magnitude as the critical exponent 7, with a heuristic argument that /? ^ 7— 1. It appears to 
be the case for 1? and 1? that /:'<7— l;ifp<7— 1 holds for d > 3, then this implies that p is zero for d>A, with 
perhaps a logarithmic correction for d = A. 

For c/ = 4, the number of self-avoiding walks of length A^, cat, is given asymptotically by ca^ ^ A (logA^) ' /i'^. 
Therefore, the probability of being able to successfully merge two SAWs (see Sec. 2.1 for definition of the merge 
operation) of A^ steps (A^+ 1 sites) to form a SAW of length 2A^+ 1 (2A^ + 2 sites) is 

C2N+1 A(log(2A^+l))'/V'+' 



4 A^{\ogN)^'^ll^^ 

= 0({\ogN)-'''). (44) 



We have already observed that 7— 1 ^ p for 1?, and so we expect that the exponent of logA^ in Eq. 44 is unlikely 
to be the same as for the probability of a successful pivot. A plausible guess for the probability is Pr(successful) = 
0((logA^)^'^), with < fc < 1. We have performed a preliminary analysis of the data for the acceptance fraction, 
/, for if', which suggests that K < 0.42; however, the plot of log(log(— /)) versus loglogA^ has not settled down 
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to linearity by A^ = T?^ — 1 w 2.7 x 10^, and so the error on this estimate may well be quite large. Assuming this 
functional form is correct, we obtain 

Pr(/.) = -j^(log/,r' 



o(i-\\ogirr ''-'■') (45) 



Tb (A^) « I log /, i- 1 (log /,) - ""- ' dir 

rN 

= !7' {logi,-y^di,- 

= o((logA^) '-''). (46) 

Assuming that < fc < 1, we have 

r(A^) = Pr(successful)rs(A^) +Pr(unsuccessful)ru(A^) 

= {\ogN)-''TsiN) + MN) 

= O ((logA?)"''logA?+ (logA?)'"") 

= of(logA^)'-'^). (47) 



Note that the time spent on successful pivots is of precisely the same order as the time spent on unsuccessful pivots. 
Once again, we mention that the above expression is based on a plausible but untested assumption. We are confident 
that r(A^) = cu(l), and also r(A^) = o(logA^) (this is the expression reported in Tables 1 and 3). 

For I? and I?, we could see that as p — > 0, the natural length scale L obtained from Eq. 42 diverges. For this 
reason, we expect that deviations from the leading order behavior of r(A^) for Z'*, where p = with logarithmic 
corrections, may be large even for extremely long walks with lO'^ steps or more. 

For (i > 4, we can use the mean-field approximation and ignore long range correlations within the chain, by 
assuming that a sub-walk of length A^ consists of sites selected uniformly at random from a sphere of radius r = 
0{N^) — 0{N^'^). Assuming that we already have a self-avoiding sub-walk with sites in the interval (j — /, , j + ;V), 
then the probability that the site j + i^ intersects with one of the previous sites is Pr(/,.) = 0{ir/r'') = 0{ir ' )■ 
Therefore 

Ti]{N) w / logirir dir 

= 0(1). (48) 

Finally, we have 

r(A^) = Pr(successful)rs(A^) +Pr(unsuccessful)ru(A^) 
= 0(1)7s(A^) + 0(1)71j(A^) 
= 0(logA^+l) 
= 0(logA'). (49) 

As 7s(A^) = ©(logA'), this leads to the stronger statement that T{N) = ©(logA^). Note that the complexity of the 
algorithm is now dominated by the time spent on successful pivots. 

3.6 Numerical evidence 

In Fig. 8 we present T{N) for Attempt_pivot_fast (Z^ and Z^), and Attempt_pivot_simple (Z^), for lengths from 

A^ = 2^ - 1 to A^ = 2^*^ - 1 « 2.7 X 10^. In Fig. 9 we present 

AT{N) = T{nV2)-T{N/V2); (50) 

the domain of A^ for this plot is reduced due to the crossover in algorithm performance which occurs at N ~ lO'*. 
These estimates for T{N) were obtained in a separate data run from the main computer experiment in the companion 
article [4]. The computers used were SunFire X4600M2 machines with 8 quad-core AMD Barcelona CPUs with 
clock frequency 2.3GHz, and 64 GB memory. 
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In Fig. 8, the plot for Attempt_pivot_simple appears to be linear, and this is verified in Fig. 9 where AT" appears 
to be constant, providing support for the prediction that T{N) = 0(logA^). The plot for Attempt_pivot_fast for Z^ 
appears to be consistent with eventually approaching a constant; in Fig. 8 we see that T{N) is visibly curved for 
A^ > 10^, and is plausibly o(logA^), while in Fig. 9 we observe that AT{N) appears to be steadily declining. The final 
point increases somewhat, but there is an anomalous increase for all three data sets which strongly suggests this is 
an artifact. For the final data point a large fraction of machine memory was used, and so it is very likely that some 
limit with the hardware was reached, causing degraded performance. For I?, there is no curvature visually apparent 
in Fig. 8 which suggests r(A^) — 0{logN), but if we examine Fig. 9 we see weak numerical evidence supporting 
T{N) = o{logN\ due to the dechne in AT{N) forN> 10^ 
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Figure 8: T{N) for Z^, and Z^, where for Z^ data were collected for both the simple and fast versions of Attempt- 
_pivot. 
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Figure 9: AT{N) for I? and I?, where for I? data were collected for both the simple and fast versions of Attempt- 
_pivot. 



Rather than directly measuring the run-time of a computer experiment, it is also possible to directly measure the 
mean number of intersection tests required, I{N), per attempted pivot for SAWs of A^ steps. For Attempt_pivot_fast, 
we can relate T{N) to I{N) as follows; 



Ts{N)=logN + Is{N), 
Tv{N)=HN), 



(51) 
(52) 
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where /s (A^) and In (N) are the mean number of intersection tests when a pivot attempt is successful and unsuccessful 
respectively. Determining I{N) is far cleaner than directly measuring T{N), which is affected by hardware, and it is 
also straightforward to separately determine Is{N) and I[j{N). 

We have performed another computer experiment where we measured /s(A^), Ii]{N), andI{N), with lengths from 



N = 7 toN = 2 
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1 w 2.7 X 10'*. In each case the initial 10 configurations were discarded (independent of N), 



and then 100 batches of 10** configurations were generated. The data from this computer experiment is presented 
visually in Figs. 10-15, where 



M{N) = IiNV2) - I{N/V2) . 



(53) 



This procedure does not properly initialize the Markov chain for large A^, but we have checked the resulting time 
series for any evidence of systematic errors, and we are confident that the systematic error is small compared to the 
statistical error. We do not show the statistical errors in Figs. 10-15, but the largest errors are smaller than the data 
point symbols; the small amount of scatter visible in plots of AI{N) is consistent with these errors. 

We note in passing that the mean number of intersection tests required per attempted pivot is remarkably low: 
for A^ = 2^*^ - 1, 1{N) is 39 for Z^, 158 for Z^ and 449 for Z^. 

Examining the evidence for I? from Figs. 10 and 11, we can see clearly that Is{N) is convex up to A^ = 2^'^ — 1, 
and SisN ^°° the lin-log plot of /s (A^) smoothly approaches a straight line strongly supporting the statement Is (N) = 
&{logN). This leads to Ts,{N) ~ &{logN), in accordance with our conclusion in Sec. 3.2. The evidence for/u(A^) is 
less conclusive: I\j{N) is convex for small A^, but becomes concave once a threshold is reached. We do not feel that 
it is likely that there is an additional length scale beyond A^ = 2^^ — 1 where the behavior of I\j (N) would change, 
and therefore believe that the most likely scenario is that A/u(A^) will decay smoothly to zero, which would imply 
I\]{N) — o{logN). This argument is by no means conclusive, but it is consistent with the prediction from Sec. 3.4 
that/u(A^) = 0(l). 
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Figure 10: /s(A^) and I\j{N) for Attempt_pivot_fast on Z^. 

The evidence for I? from Figs. 12 and 13 is less clear In Sec. 3.4 we argued that for I? the asymptotic regime 
would be reached at far greater lengths than for I?, and this certainly appears to be the case. The graphs for Is{N) 
suggest that Is{N) — &{logN), but we do not believe the evidence is particularly strong, as there is no extended flat 
region for A/s (A^) (as can be seen for I?). The plots for /y (A^) show that /u (A^) is convex for A^ up to approximately 
A^ = 10^, before becoming concave. If the downward trend in AI\j{N) were to continue in the same manner as for 
7?, then this would imply I\j{N) = o{\ogN). However, the numerical evidence supporting I\j{N) ~ o{\ogN) is weak 
at best, while a stronger case can be made on the basis of the numerics that I\j{N) = 0{logN). 

For Z'*, the evidence from Figs. 14 and 15 appears to support the statements Is{N) ~ w{\ogN) and /u(A^) — 
w{\ogN), as the plots for both Is{N) and I\j{N) are convex at A^ = 2^^ — 1. This directly contradicts the predictions 
that Is{N) = 0(logA^) and Iu{N) = o{\ogN). However, this is not too surprising, as for Z"* we have p — with 
logarithmic corrections, and so the asymptotic regime may not be reached until A^ is truly large, i.e. A^ >> 2^^. 

In summary, the numerical evidence supports our heuristic argument that Ts{N) = &{logN) for I? and to lesser 
extent for Z^ We argue that the data for /u(V) and TyjiN) imply Txj{N) = o{logN) for Z^, and ru(A^) = <9(logA^) 
for Z^; the numerics are consistent with predictions that Tu{N) — 0(1). However, the asymptotic regime has not 
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Figure 12; Is{N) and Iv{N) for Attempt_pivot_fast on Z^. 

yet been reached at A^ = 2^^ — 1, and so no strong conclusion regarding T\j{N) can be made based on the numerics. 
For Z^ the numerical evidence does not support the conclusions from our heuristic arguments; we believe that this 
is due to the asymptotic regime being well beyond A^ — T?^ — 1 . 

We wish to make one final point; although the performance of an (9(1) implementation must eventually be far 
superior to the performance of an 0(logA^) implementation, we note that for A^ as high as 2.7 x 10^ the speed-up 
gained for 1? is only a factor of 1 .65 for Attempt_pivot_fast versus Attempt_pivot_simple. This factor can be 
expected to grow, but even for walks on H? with lO'^ steps, Attempt_pivot_fast will be only a factor of two faster 
than Attempt_pivot_simple. 

3.7 Summary 

Although r(A^) is asymptotically smaller for H? and 71? than for higher dimensions, this does not mean that our 
implementation is more efficient for Z^ and T/} than for higher dimensions, as it is the integrated autocorrelation 
time in CPU units for global observables which is important. The integrated autocorrelation time in physical units 
(pivot attempts), Tint(A^), for observables such as R^ is of approximately the same order as the time needed to achieve 
a successful pivot, perhaps with an additional logarithmic factor [17]. The best available evidence suggests that there 
is no logarithmic factor for 71? and Tl? [17, 15], but there is as yet no evidence on this question for Z'' with c/ > 4. We 
define Tint(A^) as the integrated autocorrelation time in CPU units, which can be expressed as Tint(A^) = T\nt{N)T{N). 
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Figure 13: Ms{N) and Mx]{N) for Attempt_pivot_fast on Z^ 




Figure 14: Is{N) and Iv{N) for Attempt_pivot_fast on Z'* 



Neglecting constant factors, the relevant expressions for r(A^) and Tint(A^) are therefore 

T{N) = Pr(successful)rs(A^) +Pr(unsuccessful)ru(A^); 
Tint(A^) = T,„iiN)T{N) 

T{N) 



Pr(successful) 



(54) 
(55) 



, , Pr(unsuccessful) , , 
-^^(^)+ Pr(successful) ^"(^)- 

This leads to the estimates for the complexity of T{N) and Tint(A^) for Z'' given in Table 3. 

We observe that our implementation of the pivot algorithm has a crossover at dimension d = 4: foTd< 4, most 
CPU time is spent on unsuccessful pivots, while for d > 4, most CPU time is spent on successful pivots. 

4 Initialization 

As discussed in [17], the pivot algorithm has short integrated autocorrelation time of 0{N''), but long exponential 
autocorrelation time of (9(A^'+'') (p positive, but close to zero). As it is infeasible to initialize the Markov chain 
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Figure 15: Ms{N) and Mxj{N) for Attempt_pivot_fast on Z^. 



Table 3: T{N) and t(A^) for A^-step SAWs on Z''. The expressions for Tint(A^) may have an additional logarithmic 
factor [17]. 



Z'^ If,d>A 



T{N) 0(1) 0(1) o(logA^) 0(logA?) 

%nt{N) 0{N°-^'^) 0(A^O-") 0(logA?) 0(logA?) 



for large A^ by directly sampling from the equilibrium distribution via dimerization [1] (for d >A dimerization 
is sufficiently efficient that it can be used even for very long walks), there will necessarily be a systematic bias 
introduced from the initialization. To ensure that the systematic error is much less than the statistical error, it is 
necessary to discard a number of time steps which is significantly larger than the exponential autocorrelation time. 
Madras and Sokal [17] argued that the exponential autocorrelation time is 0{N/f), where / is the acceptance 
fraction of the pivot algorithm, and discarded the first 20N/f time steps. We adopt the same procedure here. We 
note that for sufficiently large A^ the time for initialization can dominate the running time of the algorithm; for the 
longest walks studied, initialization took approximately 2 weeks of computer time. 

There is an additional complication for our implementation: although we expect the mean time per pivot attempt 
for the SAW-tree implementation to be 0(1), this is not necessarily true for atypical SAWs. In particular, if the 
Markov chain is initialized with a walk with long straight segments, such as a straight rod, then there can be O(A^) 
nearest neighbor contacts between two halves of a walk, and so in the worst case a successful pivot may take time 
0{N), which is the same as the average-case performance of the implementation of Madras and Sokal, and far worse 
than O(logA^). We have not precisely characterized the average-case behavior of the SAW-tree implementation 
when initialized with a straight rod, but it is clear that pivot attempts are far slower when the walk has many straight 
segments. It may be of interest to study the typical performance of the algorithm when initialized with a straight 
rod, e.g. to characterize how rapidly the CPU time per pivot attempt decays to the equilibrium value, and this will 
be done in a future computer experiment. 

To overcome the difficulty involving straight rods, we developed the pseudo-dimerization procedure defined in 
Sec. 2.6, Pseudo_dunerize. This procedure utilizes repeated pivot operations to generate SAWs in time 0(A^) which 
are quite difficult to distinguish from SAWs sampled from the uniform distribution. For A^ = 33554431, 2QN/f 
corresponds to approximately 50 batches of 10^. In Fig. 16, initialization bias is visually apparent for (at most) 
the first 10 batches, which suggests that discarding 2QN/f configurations is quite conservative when initializing the 
Markov chain by using the pseudo-dimerization procedure. If we could argue that the pseudo-dimerization samples 
from a distribution which is in some sense close to the uniform distribution, and if we could quantify this, then it 
might be possible to spend less time on initialization. In the absence of any such argument, we strongly recommend 
a cautious approach, i.e. that 2QN/ f configurations should be discarded at the beginning of each data run. 

Although this has not been carefully tested, it seems that there is no significant increase in the time per attempted 
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Figure 16: Initialization for 3355443 1-step SAWs on I?, in batches of 10^; means for batches of lO' configurations 
are shown as lines. The initial configuration was generated using Pseudo_dunerize. 

pivot during the initialization period when the initial SAW is generated via the pseudo-dimerization procedure. It 
may appear that Pseudo_dimerize is not strictly necessary, but in practice it significantly extends the length of walks 
which can be studied compared to initialization with a straight rod. 

4.1 Algorithmic complexity of Pseudo_dimerize 

We will now provide a theoretical argument as to why pseudo-dimerization takes time @{N). 

The mean CPU time needed for Pseudo_dimerize to generate a SAW of n sites may be expressed recursively 
from the time required to generate two SAWs of «/2 sites. We use the number of steps A^ = « — 1 in the following 
expressions for consistency with other sections, and denote the mean CPU time as Tpd{N). Thus we have: 

?pd(A^) = 2T{N/2) + (CPU time to merge walks) 

= 2Tpo{N/2) + (CPU time per pivot attempt) 

X (Expected number of pivot attempts) (56) 

In Sec. 3 we predicted that the CPU time per pivot attempt for Z^ andZ^ is (9(1), and 0{logN) fori/' withii >2. 
We will use the weaker bound 0{logN) for our argument here. Calculating the expected number of pivot attempts 
is, however, quite subtle. Naively, we can assume that each of the n/2 site walks has been sampled uniformly from 
SAWs of length n/2; we expect that this approximation is "good enough", and will only result in a small error 
that will not change our conclusions. The number of SAWs of A^ steps for Z^ and Z^ is given asymptotically by 
c„ ~ Ajj.'^N'''^^, and so the probability of successfully merging two independent SAWs with n/2 sites is given by 



Pr(merge) 



Cn-l 

'■n/2-1 

(A^«/2-i(„/2-l)r-i)^ 



(57) 



However, successive SAWs in the Markov chain are highly correlated, and we account for this correlation by intro- 
ducing an observable C, defined as 



Cico',co'') = 



1 (0 and co'^ can be merged 
co' and co'^ cannot be merged. 



(58) 



The expected number of pivot attempts before the left and right sub-walks are successfully merged is Tint(C)/(C), 
and we know that 

(C)=Pr(merge) = C>(A?^"''). 
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In our implementation of Pseudo_dimerize we perform an additional 0{N^'^) pivot attempts after the sub-walks 
have been merged, and hence 

TpoiN) = 2TpoiN/2) + OilogN)iN^-'TMiC) +N^/^) 

= 2Tpd{N/2) + OiNT^-' logA?)T,nt(C) + 0{N'^^\ogN). (59) 

Eq. 57 leads to the conclusion that the mean distance of the first intersection between the left- and right-hand 
sub-walks from the joint where the walks meet is 0{N^^'^). However, there may well be other length scales which 
are also important for the calculation of Tint(C); in particular, the shape of the walks near the joint at distance scales 
from (9(1) up to 0{N^^'^) will strongly affect the probability of successful merging. This will be discussed in much 
greater detail in a subsequent paper, in the context of our calculation of the critical exponent 7 for self-avoiding 
walks. 

We expect that Tint(C) is of at most the same order as the time needed to achieve a successful pivot on all possible 
length scales, i.e. the time to achieve a successful pivot in each of the ranges, in terms of distance from the joint, 
of [1,2), [2,4), [4,8), • • • , [n/2,n — 1). By choosing pivot nodes using Randoin_integer_log, pivot sites are chosen 
uniformly at random on a logarithmic scale in terms of their distance from the joint, and so in time 0{logN) pivot 
attempts are made on all length scales. When selecting the pivot site uniformly at random (on a linear scale), the 
probability of a successful pivot is 0{N^^). We expect that a pivot attempt is more likely to be successful if the pivot 
site is near one end rather than close to the middle, and thus for pivots selected on a logarithmic scale the probability 
of a successful pivot will remain 0{N^''). Altogether, when using Randoin_integer_log to select pivot sites, this 
implies that Tint(C) = 0{N''logN). Thus, our final recurrence relation is 

Tpo{N)^2TpD{N/2)+0{N^-^+''log^N) + 0{N^/^-logN). (60) 

For A < 1, the solution of the recurrence relation Tpu{N) == 2Tpo{N/2) + 0{N^ log^N) + 0{N^I^\ogN) is 
TpY){N) = &{N). The sequence of approximations used leads to the estimate Awy— l+/9ss 0.53 for Z^ and 
A w 0.27 for Z^; we are confident that our approximations are sufficiently accurate that the correct A will be less 
than one, and hence that the expected running time of Pseudo_dunerize is indeed &{N). This argument can be 
straightforwardly repeated for Z'', li > 4, leading to the same conclusion that rpD(A^) — &{N). 

We have clear numerical evidence from informal computer experiments that this result is correct. 

5 Error estimates and the autocorrelation function 

Following [15], and given the variance of an observable, var(A) = (A^) — (A)^, we define the autocorrelation function 
for the time series measurement of an observable A as 

PAA[t) = 77^^ ■ (61) 

var(A) 

We have calculated the autocorrelation function for the Euclidean-invariant moments 7?^^ with xG{e,g,m},l< 
A; < 5, for A^ = 2' - 1, 9 < / < 22, for times t < 8192. We invested approximately 300 hours of CPU time in this 
endeavor, a relatively small amount compared with the 16500 CPU hours spent on the computer experiment to 
determine v in [4]. The autocorrelation functions for R^, with A^ = 51 1 and A^ — 2p- — 1 = 4194303, are shown in 
Fig. 17. Error bars are not shown on the graph, but the approximate size of the errors can be inferred by the degree 
of scatter from smooth behavior 

Despite the apparent linearity observed in Fig. 17 (particularly for A^ — 4.2 x 10^), it is surprisingly difficult to 
extract reliable estimates for the rate of decay of the tail of the autocorrelation function. Perhaps this is because the 
tail is not characterized by a single exponent. Madras and Sokal [17] showed that the pivot algorithm itself has a 
variety of exponents for the acceptance fraction for different classes of lattice symmetries, and this behavior may 
extend to the autocorrelation function itself This problem certainly deserves further study, but we do not have data 
of sufficient quality to be able to accurately characterize the autocorrelation function. 

In Fig. 18, we calculated the integrated autocorrelation time for /?g using two different methods. 

For the direct method, we calculated 

Tint(A) = - + £pAA(0 (62) 

^ t=\ 

by using direct summation for short times (t < 128), and fitting the intermediate regime with a power law truncated 
at exactly / = N/f (where / is the fraction of pivot attempts which are successful). The accuracy of this method 
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Figure 17: Autocorrelation function for observables R^, Rz, and ^^. See Eq. 22 for a definition of ^j^; the key 
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Figure 18: Tint for R^. 

relies on the assumptions that the exponential autocorrelation time is of 0{N / f), and that the intermediate regime 
of the autocorrelation function can be adequately fitted by a single power law. We neglect the regime t > N/f where 
p{t) is presumed to decay exponentially, as the contribution of this tail to Tint is negligible compared to the error 
introduced by other approximations, e.g. the choice of f = N/f rather than t — cN/f with c j^ I. 

The indirect method used the batch estimates for the observables, and corresponding confidence intervals stdev(A), 
by solving 



stdev(A) = 



2Tint(A)var(A) 

'^sample 



(63) 



for Tint (A). The accuracy of this technique relies on the assumption that the batch error estimate is accurate, which 
in turn relies upon the degree of correlation between successive batches being negligible. Provided the exponential 
autocorrelation time is finite (guaranteed for a finite system), then this condition will be satisfied for sufficiently 
large batch size. If this condition were not satisfied then estimates of stdev(A), and consequently Tint(A), would be 
systematically low. To estimate Tint for R^, with x e {e,g,m}, we used data from the companion article [4], with 
1000 batches of 10^ pivot attempts, and 125 batches of 8 x 10^ pivot attempts. 

The two indirect estimates shown in Fig. 18 are indistinguishable, although there is more scatter for the batches 
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of 8 X 10^ because of the smaller number of batches used for the estimate of stdev(A). This is strong evidence that 
a batch size of 10^ is sufficiently large so that the degree of correlation between successive batches is negligible up 
to at least A^ = 2^^ — 1 w 3.36 x 10^. Hence, we expect the confidence intervals for estimates of observables in [4] 
to be accurate, and recommend the batch method for use with the pivot algorithm as a simple and reliable method 
for estimating confidence intervals. 

The direct estimates for Tint are close to the indirect estimates, but for sufficiently large A^ the direct estimates are 
systematically low. This suggests that either the tail fitting procedure breaks down for large A^, which we consider 
unlikely as it is clear from Fig. 17 that the tail has little curvature for large A^. Or, for large A^ the truncated part of the 
tail still contributes non-negligibly to the integrated autocorrelation time, i.e. the exponential autocorrelation time 
is greater than 0{N / f). We consider this latter explanation to be more probable, and will explore the asymptotic 
behavior of the exponential autocorrelation time in future work. 

We refer the interested reader to [17, 15] for more information on the autocorrelation function for the pivot 
algorithm. 

6 Performance: comparison with other implementations 

In this section we present detailed comparison of the performance of the S AW-tree implementation with the imple- 
mentations of Madras and Sokal [17] and Kennedy [9], for A^-step SAWs on 1?, 1?, and Z^, with A^ ranging from 3 
to 33554431. 

For this section we will use the shorthand notations S-t for the S AW-tree implementation, M&S for the hash 
table implementation of Madras and Sokal, and K for Kennedy's implementation. 

6.1 Experimental details 

For testing M&S we wrote our own version using the programming language C. This implementation has not been 
extensively polished for maximum efficiency, and so it is highly Ukely that there exist other implementations which 
are faster by a (small) constant factor. 

For testing K, we used Kennedy's C++ program SAW_pivot vl.O, which has been released under the GNU Gen- 
eral Public Licence. For information about this implementation, we recommend you consult the relevant article [9], 
as well as the source code^. We used the default settings for SAW_pivot, which meant that updates to the data 
structure were performed every A^pivot = [(A^/40) '' ^J successful pivots (we set A^pivot = 1 for A^ < 40). Kennedy [9] 
indicates that the performance of the algorithm is relatively insensitive to the precise choice of A^pivot, and so we 
expect that tuning A^pivot would result in, at most, only modest improvement in performance. 

We have observed that K is faster when walks are rod-like, presumably because the intersection testing algo- 
rithm is more efficient when SAWs are spread out^. For this reason the timing experiments were initialized with 
SAWs sampled from the equilibrium distribution, rather than straight rods; these SAWs were generated via the pivot 
algorithm using the SAW-tree implementation. 

The computer experiment was performed on a 64-bit Linux machine with Xeon Barcelona 2.83GHz quad-core 
processors. The programs were compiled with gcc version 4.1.2, with optimization flag "-03"; other optimization 
flags made little if any difference. 

We do not include error bars for our measurements, but we have repeated the experiment to ensure that these 
numbers are reproducible, and have verified that the deviations between different runs are quite small. We present 
our results in Table 4 of Appendix B and graphically in Figs. 19-21. 

6.2 Discussion 

In Fig. 19, and to a lesser extent in Figs. 20 and 21, a kink is visible in each of the curves, indicating the length of 
walk where a hardware limit is reached and the computer programs become memory bound. This occurs at shorter 
lengths for S-t and M&S, as our implementations of these algorithms use significantly more memory than K. 

In [4] we made the statement "For SAWs of length A^ = 10^ on the cubic lattice, the performance gain for our 
implementation is approximately 200 when compared with Kennedy's, and over a thousand when compared with 
that of Madras and Sokal". This statement was based on testing of an earlier version of our SAW-tree implementa- 
tion, on a different computer to the tests reported here, and we regard the figures in Table 4 of Appendix B as more 
reliable. In this table, we find that for SAWs on I? with A^ = 1048576, S-t is 385 times faster than K and 3830 
times faster than M&S. 



^Available at http : //math, arizona.edu/~tgk/. 

Interestingly, this is in starlc contrast to the behavior of the SAW-tree implementation, where long straight segments in rod-like walks result 
in significant peiformance degradation, as explained in Sec. 4. 
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Figure 19: T{N) for the Madras and Sokal, Kennedy, and SAW-tree implementations on 7? 
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Figure 20: r(A^) for the Madras and Sokal, Kennedy, and SAW-tree implementations on Z^. 

For this computer experiment the only observable calculated was R^; it is straightforward to extend this to other 
observables such as rI and /?^ for the S-t and M&S implementations for a constant factor penalty. It is likely also 
possible to do the same thing for K, but despite the clear performance advantage for this algorithm over MS S, to the 
best of our knowledge this has not been done. 

We observe from the table and graphs that S-t is lightweight, as it is comparable with M& S for short walks with 
as few as 15 steps. For Z^, Z^, and Z"*^, S-t is in fact faster than the other implementations for 63 or more steps. 

The difference between the implementations is particularly stark for A^ > 10^, where S-t is significantly faster 
than the other implementations in all dimensions; this improvement is quite dramatic for Z^ and Z'*. 

On the basis of this computer experiment, where the precise timings are compiler and machine dependent, we 
can nevertheless draw the following robust conclusions: the SAW-tree implementation is efficient for short walks, 
and much more powerful than other implementations for long walks. Compared with Kennedy's implementation, 
there is a large performance boost for Z^, and a dramatic performance boost for Z'' with d>3. 
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Figure 21: T{N) for the Madras and Sokal, Kennedy, and SAW-tree implementations on Z"* 



A Example SAW-trees 




Figure 22: SAW-tree which is precisely equivalent to the pivot sequence representation for a walk with n sites. Note 
that ^(0) is an overall symmetry which is applied to the whole walk, and cannot be directly included in the SAW-tree 
data structure. 
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Xe=(3,0) 



e=[i,3]x[o,i] 



1 
1 
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B'=[1,2] x[0,1] 
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Figure 23: A SAW-tree representation of cOa (from Fig. 1) involving proper rotations only. 



Xe = (3,0) 

B=[l,3]x[0,l] 



q'=q{l) = 



1 

1 



B'=[1,1]x [0,1] 




X^=(2,l) 
q'-=q{3) = 



1 
-1 



B''=[l,2]x [0,1] 




X-=(l,-l) 
B''-=[l,l]x [-1,0] 





Figure 24: A SAW-tree representation of cOa (from Fig. 1) involving proper and improper rotations. 

B Running times of SAW-tree, Madras and Sokal, and Kennedy imple- 
mentations of the pivot algorithm 
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